18 research outputs found

    Formal Verification of Input-Output Mappings of Tree Ensembles

    Recent advances in machine learning and artificial intelligence are now being considered in safety-critical autonomous systems where software defects may cause severe harm to humans and the environment. Design organizations in these domains are currently unable to provide convincing arguments that their systems are safe to operate when machine learning algorithms are used to implement their software. In this paper, we present an efficient method to extract equivalence classes from decision trees and tree ensembles, and to formally verify that their input-output mappings comply with requirements. The idea is that, given that safety requirements can be traced to desirable properties on system input-output patterns, we can use positive verification outcomes in safety arguments. This paper presents the implementation of the method in the tool VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case studies presented in current literature. We demonstrate that our method is practical for tree ensembles trained on low-dimensional data with up to 25 decision trees and tree depths of up to 20. Our work also studies the limitations of the method with high-dimensional data and preliminarily investigates the trade-off between a large number of trees and the time taken for verification.
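    The idea of extracting equivalence classes from a decision tree can be sketched as follows. This is a minimal illustration, not the VoTE implementation: the `Node` class, the `equivalence_classes` function, the example tree, and the requirement "output must not exceed 5.0" are all hypothetical. Each equivalence class is a box of half-open intervals `(low, high]`, one per input feature, together with the single output value the tree produces for every input in that box.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Internal node: go left when x[feature] <= threshold, otherwise right.
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Optional[float] = None  # set only on leaves


def equivalence_classes(node, box):
    """Yield (box, value) pairs; box holds one (low, high] interval per feature."""
    if node.value is not None:
        yield box, node.value
        return
    lo, hi = box[node.feature]
    t = node.threshold
    if lo < t:  # some input in the box satisfies x <= t
        left_box = list(box)
        left_box[node.feature] = (lo, min(hi, t))
        yield from equivalence_classes(node.left, left_box)
    if hi > t:  # some input in the box satisfies x > t
        right_box = list(box)
        right_box[node.feature] = (max(lo, t), hi)
        yield from equivalence_classes(node.right, right_box)


# Hypothetical two-feature tree and input domain.
tree = Node(feature=0, threshold=0.5,
            left=Node(value=1.0),
            right=Node(feature=1, threshold=2.0,
                       left=Node(value=3.0),
                       right=Node(value=7.0)))
domain = [(0.0, 1.0), (0.0, 4.0)]
classes = list(equivalence_classes(tree, domain))

# Hypothetical requirement: the output must not exceed 5.0.
violations = [box for box, value in classes if value > 5.0]
print(violations)  # boxes that serve as counterexamples
```

    Checking the requirement once per box covers every input in the domain, which is what makes enumeration attractive when the number of boxes stays manageable.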

    Fluid balance-adjusted creatinine in diagnosing acute kidney injury in the critically ill

    Background Acute kidney injury (AKI) is often diagnosed based on plasma creatinine (Cr) only. Adjustment of Cr for cumulative fluid balance due to potential dilution of Cr and subsequently missed Cr-based diagnosis of AKI has been suggested, although the physiological rationale for these adjustments is questionable. Furthermore, whether these adjustments lead to a different incidence of AKI when used in conjunction with urine output (UO) criteria is unknown. Methods This was a post hoc analysis of the Finnish Acute Kidney Injury study. Hourly UO and daily plasma Cr were measured during the first 5 days of intensive care unit admission. Cr values were adjusted following the previously used formula and combined with the UO criteria. Resulting incidences and mortality rates were compared with the results based on unadjusted values. Results In total, 2044 critically ill patients were analyzed. The mean difference between the adjusted and unadjusted Cr of all 7279 observations was 5 (± 15) µmol/L. Using adjusted Cr in combination with UO and renal replacement therapy criteria resulted in the diagnosis of 19 (1%) additional AKI patients. The absolute difference in the incidence was 0.9% (95% confidence interval [CI]: 0.3%-1.6%). Mortality rates were not significantly different between the reclassified AKI patients using the full set of Kidney Disease: Improving Global Outcomes criteria. Conclusion Fluid balance-adjusted Cr resulted in little change in AKI incidence, and only minor differences in mortality between patients who changed category after adjustment and those who did not. Using adjusted Cr values to diagnose AKI does not seem worthwhile in critically ill patients. Peer reviewed
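    The abstract does not state which adjustment formula was used. As an assumption, the sketch below uses a form commonly cited in the literature: adjusted Cr = Cr × (1 + cumulative fluid balance / total body water), with total body water estimated as 0.6 × body weight in kg. The function name, parameter names, and example values are hypothetical.

```python
def adjusted_creatinine(cr_umol_l, cumulative_balance_l, weight_kg, tbw_fraction=0.6):
    """Fluid balance-adjusted creatinine (assumed formula, see lead-in).

    cr_umol_l: measured plasma creatinine in µmol/L
    cumulative_balance_l: cumulative net fluid balance in litres
    weight_kg: body weight in kg
    tbw_fraction: assumed share of body weight that is water
    """
    total_body_water_l = tbw_fraction * weight_kg
    return cr_umol_l * (1 + cumulative_balance_l / total_body_water_l)


# Hypothetical patient: 70 kg, +4.2 L cumulative balance, measured Cr 100 µmol/L.
print(adjusted_creatinine(100.0, 4.2, 70.0))  # roughly 110 µmol/L
```

    A positive fluid balance raises the adjusted value and a negative balance lowers it, which is why adjustment can move a patient across a Cr-based diagnostic threshold.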

    The management of adult patients with severe chronic small intestinal dysmotility

    Adult patients with severe chronic small intestinal dysmotility are not uncommon and can be difficult to manage. This guideline gives an outline of how to make the diagnosis. It discusses factors which contribute to or cause a picture of severe chronic intestinal dysmotility (eg, obstruction, functional gastrointestinal disorders, drugs, psychosocial issues and malnutrition). It gives management guidelines for patients with an enteric myopathy or neuropathy including the use of enteral and parenteral nutrition.

    A922 Sequential measurement of 1 hour creatinine clearance (1-CRCL) in critically ill patients at risk of acute kidney injury (AKI)

    Meeting abstract

    Improving Quality of Avionics Software Using Mutation Testing

    Mutation testing is a powerful fault-based testing technique that makes syntactic changes to a program under test in order to simulate real faults otherwise caused by a programmer. Similar to structural coverage criteria such as statement coverage, mutation testing is used to assess the quality of a test suite. After a syntactic change has been made, the program is referred to as a mutant that either can survive a test suite, or be killed by one. If a mutant is killed, it means that the test suite has detected the syntactic change and reported it as an error, resulting in an increased mutation score. If a mutant survives, it means that the test suite failed to detect the fault and the mutation score is decreased. Mutation testing is generally considered the strongest testing technique available in terms of fault detection, but also the most expensive one. However, thanks to recent research and the rapid development of computing hardware, the testing technique is starting to become feasible, motivating the creation of tools utilizing the power of mutation testing. Saab AB, the Swedish aircraft manufacturer and stakeholder in this thesis, has experimented with mutation testing in the past, resulting in a tool called BAX that creates textual modifications of the original source code. The initial goal of this thesis is to provide a new tool that is faster than BAX, and that is more systematic in the way mutants are generated. LLVM-P86, the main contribution of this thesis, is a compiler and mutation testing framework intended for the programming language Pascal-86. Unlike BAX, LLVM-P86 is able to encode several mutants into a single program, thus reducing the time spent on compiling source code. In the conducted experiments, LLVM-P86 processed mutants significantly faster than BAX, on average by a factor of 13.6. Since LLVM-P86 is also a compiler, proper type information is available when mutants are generated. 
The additional type information allows LLVM-P86 to avoid a significant number of equivalent mutants, i.e. mutants that behave in the same way as the original program. When mutating relational operators found in approximately 10,000 lines of code, distributed amongst 18 different Pascal-86 modules, LLVM-P86 was able to reduce the total number of living mutants by 25%, or 5.7% of the complete set of mutants.
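    The notions of killed and surviving mutants, and of relational-operator mutation, can be illustrated with a small sketch. It is written in Python for brevity and does not reflect how LLVM-P86 or BAX work on Pascal-86 code. The program under test (`is_adult`), the test suite (`run_tests`), and the helper names are hypothetical. The mutant is selected by a parameter instead of by editing and recompiling the source, which loosely mirrors the idea of encoding several mutants into a single program. The score is computed here as killed mutants divided by all generated mutants, which is one common definition.

```python
import operator

# Relational operators that can replace one another.
RELATIONAL = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
              ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def make_is_adult(op_name):
    """Build one variant of the program under test; the original uses '>='."""
    compare = RELATIONAL[op_name]

    def is_adult(age):
        return compare(age, 18)

    return is_adult


def run_tests(is_adult):
    """Hypothetical test suite; returns True when every test passes."""
    return is_adult(30) is True and is_adult(5) is False


def mutation_analysis(original_op, tests):
    """Return (mutation score, surviving operators) for all operator replacements."""
    killed, survived = [], []
    for op_name in RELATIONAL:
        if op_name == original_op:
            continue  # skip the unmodified program
        mutant = make_is_adult(op_name)
        (survived if tests(mutant) else killed).append(op_name)
    score = len(killed) / (len(killed) + len(survived))
    return score, survived


score, survived = mutation_analysis(">=", run_tests)
print(score, survived)
```

    With this test suite the `>` mutant survives because no test exercises the boundary value 18. Adding a test such as `is_adult(18) is True` would kill it, which is the kind of test-suite weakness that mutation testing is meant to expose.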

    Formal Verification of Tree Ensembles in Safety-Critical Applications

    In the presence of data and computational resources, machine learning can be used to synthesize software automatically. For example, machines are now capable of learning complicated pattern recognition tasks and sophisticated decision policies, two key capabilities in autonomous cyber-physical systems. Unfortunately, humans find software synthesized by machine learning algorithms difficult to interpret, which currently limits their use in safety-critical applications such as medical diagnosis and avionic systems. In particular, successful deployments of safety-critical systems mandate the execution of rigorous verification activities, which often rely on human insights, e.g., to identify scenarios in which the system shall be tested. A natural pathway towards a viable verification strategy for such systems is to leverage formal verification techniques, which, in the presence of a formal specification, can provide definitive guarantees with little human intervention. However, formal verification suffers from scalability issues with respect to system complexity. In this thesis, we investigate the limits of current formal verification techniques when applied to a class of machine learning models called tree ensembles, and identify model-specific characteristics that can be exploited to improve the performance of verification algorithms when applied specifically to tree ensembles. To this end, we develop two formal verification techniques specifically for tree ensembles: one fast and conservative, and one exact but more computationally demanding. We then combine these two techniques into an abstraction-refinement approach that we implement in a tool called VoTE (Verifier of Tree Ensembles). Using a couple of case studies, we recognize that sets of inputs that lead to the same system behavior can be captured precisely as hyperrectangles, which enables tractable enumeration of input-output mappings when the input dimension is low. 
Tree ensembles with a high-dimensional input domain, however, seem generally difficult to verify. In some cases though, conservative approximations of input-output mappings can greatly improve performance. This is demonstrated in a digit recognition case study, where we assess the robustness of classifiers when confronted with additive noise.
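    One way to picture a fast, conservative approximation is to bound each tree separately over a box of inputs and add the bounds. The sketch below assumes an additive ensemble (the output is the sum of the tree outputs) and is not the algorithm implemented in VoTE. Trees are hypothetical nested tuples `(feature, threshold, left, right)` with floats as leaves, and inputs go left when `x[feature] <= threshold`.

```python
def reachable_range(tree, box):
    """Return (min, max) over leaf values reachable from inputs in box.

    box holds one closed interval (low, high) per feature. Treating the
    right branch as closed at the threshold can only widen the result,
    so the bounds stay sound.
    """
    if not isinstance(tree, tuple):
        return tree, tree  # leaf value
    feature, threshold, left, right = tree
    lo, hi = box[feature]
    ranges = []
    if lo <= threshold:  # left branch reachable
        left_box = list(box)
        left_box[feature] = (lo, min(hi, threshold))
        ranges.append(reachable_range(left, left_box))
    if hi > threshold:  # right branch reachable
        right_box = list(box)
        right_box[feature] = (max(lo, threshold), hi)
        ranges.append(reachable_range(right, right_box))
    return min(r[0] for r in ranges), max(r[1] for r in ranges)


def ensemble_bounds(trees, box):
    """Conservative output interval: sum the per-tree minima and maxima."""
    ranges = [reachable_range(t, box) for t in trees]
    return sum(r[0] for r in ranges), sum(r[1] for r in ranges)


# Hypothetical ensemble and a box around a nominal input (additive noise).
trees = [
    (0, 0.5, 1.0, (1, 2.0, 3.0, 7.0)),
    (1, 2.0, 5.0, -1.0),
]
box = [(0.6, 0.9), (1.5, 2.5)]
print(ensemble_bounds(trees, box))
```

    For this example the interval is wider than the set of outputs the ensemble can actually produce on the box (6.0 and 8.0), because the extreme leaves of the two trees cannot be reached by the same input. A requirement such as "output at most 12.5" is settled by the bound alone, whereas "output at most 10" would need an exact analysis or a refinement of the box.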